(54) Video signal decoding.



(57) A digital video signal that has been encoded using
motion-compensated prediction, transform encoding, and
variable-length coding is decoded using parallel processing.
Frames of the video signal are divided into slices (1, 2, 3, 4)
made up of a sequence of macroblocks (MB). The signal to be
decoded is slice-wise divided for parallel variable-length
decoding. Each variable-length-decoded macroblock is divided into
its constituent blocks for parallel inverse transform processing.
Resulting blocks of difference data are added in parallel to
corresponding blocks of reference data. The blocks of reference
data corresponding to each macroblock are read out in parallel
from reference data memories (44, 45, 46, 47) on the basis of a
motion vector (83) associated with the macroblock. Reference data
corresponding to each macroblock is distributed for storage among
a number of reference data memories.
This invention relates to decoding of prediction-coded video
signals, and more particularly is directed to the application of
parallel processing to such decoding.

It is known to perform compression coding on video data which
represents a moving picture in order to reduce the quantity of
data to be recorded and/or transmitted. Such data compression may
be useful, for example, in recording/reproducing systems using
recording media such as magnetic tape or optical disks, and is
also useful in transmission systems such as those used for video
teleconferencing, video telephones, television broadcasting
(including direct satellite broadcast), and the like. For example,
it has been proposed by the Moving Picture Experts Group (MPEG) to
compression-code moving picture video data utilizing
motion-compensated prediction, transform processing using an
orthogonal transformation such as the discrete cosine transform
(DCT), and variable-length coding. A system for decoding and
reproducing such compression-coded video data is illustrated in
block diagram form in Figure 14 of the accompanying drawings.

As shown in Figure 14, a sequence of compression-coded video data
is provided at an input terminal 101 for processing, in turn, by
an inverse VLC (variable-length coding) circuit 102, an inverse
quantization circuit 103, and an inverse DCT circuit 104. An
adding circuit 105 forms a reconstructed frame of video data on
the basis of a difference signal provided from the inverse DCT
circuit 104 and predictive picture data (reference data) provided
from a motion compensation circuit 106. The resulting
reconstructed video data is stored in a frame memory 107.

The motion compensation circuit 106 forms the predictive picture
data from reconstructed data previously stored in the frame memory
107 on the basis of motion compensation information (including,
for example, motion vectors) extracted from the input signal and
supplied to the motion compensation circuit 106 by the inverse VLC
circuit 102. Alternatively, with respect to frames for which
predictive coding was not performed, such as "intra-frame" coded
data, the motion compensation circuit 106 simply provides the
value "0" to the adding circuit 105. Reconstructed frames of video
data are output from the frame memory 107 via a digital-to-analog
converter 108 for display by a display device 109.

As the number of pixels in each frame of the video signal has
increased from, for example, the 352 x 240 frame used for video
telephones to the 720 x 480 frame used in the NTSC format or the
1920 x 1024 frame in an HDTV (high definition television) system,
it was found to be difficult to perform the necessary processing
using only one processor and one program execution sequence. For
this reason, it has been proposed to divide each frame of the
video data into a plurality of subframes, as illustrated in Figure
16 of the accompanying drawings, and then to provide a respective
processor for each of the plurality of subframes, so that coding
and decoding are performed with parallel processing by the
plurality of processors. For example, Figure 15 of the
accompanying drawings is a block diagram of a decoding system
provided in accordance with this proposal.

In the system of Figure 15, input sequences of encoded video data,
each representing a respective subframe, are respectively provided
via input terminals 110-113 to processors (decoder blocks)
114-117. The processors 114-117 decode the respective data
sequences based upon data supplied from frame memories 119-122,
which store respective subframes and are assigned to respective
ones of the processors 114-117. For example, processor 114 stores
a subframe of decoded data in the memory 119. In order to provide
motion compensation, a switching logic circuit 118 provided
between the processors 114-117 and the frame memories 119-122
permits the processor 114 to read out data from an adjacent
portion of the frame memory 120 as well as from all of frame
memory 119. The switching logic circuit 118 also provides frames
of output video data from the memories 119-122, via a
digital-to-analog converter 123, for display on a display device
124.

The four data sequences respectively provided to 
the processors 114-117 can, for practical purposes, 
be combined into a single data sequence by providing 
headers for controlling multiplexing of the data se- 
quence. For this purpose, a separation block (not 
shown) is provided upstream from the decoder for 
separating the combined data sequence into the four 
sequences to be provided to the respective proces- 
sors. Examples of parallel processing techniques 
which use division of a video frame into subframes 
are disclosed in U.S. Patent No. 5,138,447 and Jap- 
anese Patent Application Laid Open No. 
139986/1992 (Tokkaihei 4-139986). 

As just described, according to the conventional 
approach, the video frame was generally divided into 
subframes which were processed in parallel by re- 
spective processors. However, when a frame is div- 
ided in this manner, there are restrictions on the ex- 
tent to which the processors can access data that is 
outside of the processor's respective subframe. Al- 
though, as indicated above, a processor can access 
a region that adjoins its respective subframe, the ex- 
tent of such access is limited in order to keep the scale 
of the switching logic circuit 118 from becoming undu- 
ly large. As a result, the degree of compression effi- 
ciency is reduced, and there are variations in the qual- 
ity of the reproduced picture at the boundary between 
the subframes, which may result in visible artifacts at 
the subframe boundary. 

In addition, the processing for compression-coding is carried out
completely separately for each of the subframes, which makes it
impossible to provide compression-coding on the basis of data
blocks in other subframes, a limitation that is not present when
the frame is not divided into subframes. Accordingly, the
compression coding method must be changed to accommodate the
division into subframes, resulting in a lack of compatibility and
a loss in compression efficiency.

EP 0 614 317 A2

Furthermore, if header data is added to the data 
sequence to be recorded or transmitted in order to 
provide for multiplexing the data sequence into the re- 
spective sequences provided to the parallel proces- 
sors, the additional header data increases the over- 
head in the recorded data with a corresponding loss 
of efficiency, and it may also be necessary to change 
the coding procedure, and so forth. 

In accordance with a first aspect of the present in- 
vention, there is provided an apparatus for decoding 
a coded video signal that represents an image frame, 
said coded video signal having been divided into a 
plurality of slices each of said slices being a sequence 
of macroblocks, each of said macroblocks being a 
two-dimensional array of picture elements of said im- 
age frame, said coded video signal being a bit stream 
that represents a sequence of said slices which to- 
gether represent said image frame, said bit stream in- 
cluding a plurality of synchronizing code signals, each 
of which is associated with a respective one of said 
slices for indicating a beginning of the respective 
slice, the apparatus comprising: 

a plurality of decoding means each for decod- 
ing a respective portion of said coded video signal 
that represents said image frame; and 

distributing means responsive to said syn- 
chronizing code signals for distributing said slices 
among said plurality of decoding means. 

According to a second aspect of the invention, 
there is provided an apparatus for decoding input sig- 
nal blocks that were formed by transform encoding 
and then variable-length encoding blocks of video 
data, the apparatus comprising: 

decoding means for variable-length decoding 
a series of said input signal blocks; 

parallel data means for forming plural parallel 
data streams, each of which includes respective ones 
of said series of input signal blocks which were vari- 
able-length decoded by said decoding means; and 

a plurality of inverse transform means each for 
receiving a respective one of said parallel data 
streams and for performing inverse transform proc- 
essing on the variable-length decoded signal blocks 
in the respective data stream. 

In preferred embodiments of the apparatus just described, the
decoding circuit is one of a plurality of decoding circuits for
variable-length decoding respective series of input signal blocks,
and the apparatus further includes a distributing circuit for
forming the respective series of input signal blocks to be decoded
by the plural decoding circuits from a bit stream representing an
image frame, and the respective series of input signal blocks are
formed in response to synchronizing signals provided at
predetermined intervals in the bit stream representing the image
frame.

According to a third aspect of the invention, there is provided an
apparatus for decoding an input digital video signal which
includes groups of blocks of prediction-coded difference data,
each of said groups consisting of a predetermined plurality of
said blocks and having a respective motion vector associated
therewith, each of said blocks of prediction-coded difference data
having been formed on the basis of the respective motion vector
associated with the respective group which includes said block,
the apparatus comprising:

output means for supplying in parallel blocks of prediction-coded
difference data contained in one of said groups of blocks;

reference data means for supplying in parallel plural blocks of
reference data, each of said blocks of reference data being formed
on the basis of the motion vector associated with said one of said
groups of blocks and corresponding to one of said blocks of
prediction-coded difference data supplied by said output means;
and

a plurality of adding means each connected to said output means
and said reference data means for adding a respective one of said
blocks of prediction-coded difference data and the corresponding
block of reference data.

In preferred embodiments of the invention, the reference data
circuit includes a plurality of reference data memories from which
reference data is read out in parallel on the basis of the motion
vector associated with that group of blocks, a plurality of buffer
memories for temporarily storing reference data read out from the
plurality of reference data memories, and a distributing circuit.
According to one alternative embodiment of this aspect of the
invention, each of the buffer memories is associated with a
respective one of the reference data memories and is controlled on
the basis of the motion vector for reading out the reference data
temporarily stored therein, and the distributing circuit is
connected between the buffer memories and the adding circuits and
distributes the reference data stored in the buffer memories among
the adding circuits on the basis of the motion vector. According
to another alternative embodiment of this aspect of the invention,
each of the buffer memories is associated with one of the adding
circuits and the distributing circuit is connected between the
reference data memories and the buffer memories for distributing
among the buffer memories, on the basis of the motion vector
associated with that group of blocks, the reference data read out
from the reference data memories.

According to a fourth aspect of the invention, there is provided a
method of decoding a coded video signal that represents an image
frame, said coded video signal having been divided into a
plurality of slices each of said slices being a sequence of
macroblocks, each of said macroblocks being a two-dimensional
array of picture elements of said image frame, said coded video
signal being a bit stream that represents a sequence of said
slices which together represent said image frame, said bit stream
including a plurality of synchronizing code signals, each of which
is associated with a respective one of said slices for indicating
a beginning of the respective slice, the method comprising the
steps of:

providing a plurality of decoding means each for decoding a
respective portion of said coded signal that represents said video
frame; and

distributing said slices among said plurality of decoding means in
response to said synchronizing code signals.

The data representing each macroblock may be distributed
block-by-block among the plurality of memories or line-by-line in
a cyclical fashion among the plurality of memories.
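The two distribution schemes just mentioned can be illustrated with a minimal sketch. It assumes a 16 x 16 luminance macroblock made up of four 8 x 8 blocks and four memories; the function names, the block ordering and the list-of-rows representation are illustrative assumptions and are not taken from the specification.

```python
def split_block_by_block(mb):
    """Cut a 16x16 macroblock (list of 16 rows) into four 8x8 blocks.

    Assumed order: top-left, top-right, bottom-left, bottom-right.
    Block k would be stored in memory k.
    """
    blocks = []
    for by in (0, 8):
        for bx in (0, 8):
            blocks.append([row[bx:bx + 8] for row in mb[by:by + 8]])
    return blocks


def split_line_by_line(mb, n_mem=4):
    """Distribute lines cyclically: line y is stored in memory y % n_mem."""
    memories = [[] for _ in range(n_mem)]
    for y, row in enumerate(mb):
        memories[y % n_mem].append(row)
    return memories


# Hypothetical macroblock whose pixel value encodes its position.
mb = [[16 * y + x for x in range(16)] for y in range(16)]
by_block = split_block_by_block(mb)
by_line = split_line_by_line(mb)
```

In either scheme each memory holds one quarter of the macroblock, so all four memories can be accessed at the same time.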

A video signal decoding apparatus may be provided in which the
input coded signal is distributed for parallel processing among
several decoding circuits on the basis of synchronizing code
signals that are provided in the signal in accordance with a
conventional coding standard. In this way, parallel decoding can
be precisely carried out on the basis of synchronizing signals
provided in accordance with a conventional coding method and
during time periods available between the synchronizing signals.
In this way, restrictions on the conventional coding method can be
reduced.

In addition, the data may be sequenced on the basis of "slices",
which are a standard subdivision of a video frame constituting a
plurality of macroblocks, and the slices of data are distributed
among decoding circuits so that high speed parallel decoding may
be carried out.

Further, each of the blocks making up a macroblock may be
distributed to a respective inverse transformation circuit so that
inverse transform processing can be carried out simultaneously in
parallel for all of the blocks of a macroblock, and the inverse
transform blocks are then combined, in parallel, with reference
data to recover the video signal which had been predictive-coded.
The reference data, in turn, may be provided from parallel
memories at the same time on the basis of the motion compensation
vector for the particular macroblock, and in such a way that there
is no need to place restrictions on the motion compensation
carried out during the predictive coding. For example, there is no
need to limit the range of the motion vector.

Embodiments of the invention will now be described, by way of
example only, with reference to the accompanying drawings, in
which:

Figure 1 is a block diagram of an embodiment of 
an apparatus for decoding a moving picture video 
data signal; 

Figure 2 is a schematic illustration of a manner in 
which video data corresponding to an image 
frame is distributed for decoding; 
Figure 3 is a timing diagram which illustrates op- 
eration of a buffer memory provided in the appa- 
ratus of Figure 1; 

Figure 4 is a block diagram which illustrates a 
code buffering arrangement provided upstream 
from variable-length decoder circuits provided in 
the apparatus of Figure 1; 

Figure 5 is a timing diagram which illustrates op- 
eration of the code buffering arrangement shown 
in Figure 4; 

Figure 6 is a block diagram which shows an alter- 
native code buffering arrangement provided up- 
stream from variable-length decoder circuits pro- 
vided in the apparatus of Figure 1; 
Figure 7 is a timing diagram which illustrates op- 
eration of the code buffering arrangement shown 
in Figure 6; 

Figures 8(A), 8(B) and 8(C) together schematically
illustrate a manner in which reference data is
provided on the basis of a motion vector to adders
that are part of the apparatus of Figure 1;
Figure 9 is a timing diagram which illustrates an 
operation for providing reference data to the ad- 
ders which are part of the apparatus of Figure 1; 
Figure 10 is a block diagram of another embodi- 
ment of an apparatus for decoding a moving pic- 
ture video data signal; 

Figures 11(A), 11(B) and 11(C) together schematically
illustrate a manner in which reference data
is provided on the basis of a motion vector to adders
that are part of the apparatus of Figure 10;
Figures 12(A) and 12(B) together schematically 
illustrate an alternative manner in which refer- 
ence data is provided on the basis of a motion 
vector to the adders which are part of the appa- 
ratus of Figure 10; 

Figure 13 is a timing diagram which illustrates an
operation for providing reference data according
to the example shown in Figure 12;
Figure 14 is a block diagram of a conventional ap- 
paratus for decoding and reproducing a moving 
picture video data signal; 

Figure 15 is a block diagram of a portion of a con- 
ventional apparatus for decoding and reproduc- 
ing a moving picture video data signal by means 
of parallel processing; and 
Figure 16 schematically illustrates operation of 
the conventional decoding apparatus of Figure 
15. 

A preferred embodiment of the invention will now 
be described, initially with reference to Figure 1. 
Figure 1 illustrates in block diagram form an apparatus for
decoding a moving picture video data signal that has been coded
according to a proposed MPEG standard system.

An input bit stream representing the coded video data signal is
provided to a demultiplexer 25, by means of which the input signal
is distributed, slice-by-slice, to code buffers 26-29.

Figure 2 illustrates the slice-by-slice distribution of the input
data. As is well known to those who are skilled in the art, each
slice is a sequence of macroblocks transmitted in raster scanning
order. The starting point of each slice is indicated by a
synchronizing code signal, and the slices are provided so that
transmission errors and the like can be confined to a single
slice, because after an error occurs, proper decoding can resume
at the synchronizing code signal provided at the beginning of the
subsequent slice. Accordingly, the demultiplexer 25 is provided
with a circuit which detects the synchronizing code signals, and
distribution of the input signal among the code buffers 26-29 is
carried out in response to the detected synchronizing code
signals.
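The detection of synchronizing code signals can be sketched as follows. The sketch assumes MPEG-style byte-aligned slice start codes, that is, the prefix 00 00 01 followed by a code byte in the range 0x01 to 0xAF; the function name `split_slices` is hypothetical, and any data preceding the first slice (for example a picture header) is simply dropped here.

```python
START_PREFIX = b"\x00\x00\x01"


def split_slices(bitstream: bytes):
    """Split a byte stream at slice start codes and return the slices."""
    starts = []
    i = bitstream.find(START_PREFIX)
    while i != -1 and i + 3 < len(bitstream):
        # Only code bytes 0x01..0xAF mark the beginning of a slice.
        if 0x01 <= bitstream[i + 3] <= 0xAF:
            starts.append(i)
        i = bitstream.find(START_PREFIX, i + 3)
    ends = starts[1:] + [len(bitstream)]
    return [bitstream[s:e] for s, e in zip(starts, ends)]


# Hypothetical stream: a picture header followed by three slices.
stream = (b"\x00\x00\x01\x00HDR"
          b"\x00\x00\x01\x01AB"
          b"\x00\x00\x01\x02C"
          b"\x00\x00\x01\x03")
slices = split_slices(stream)
```

Because each returned slice begins with its own start code, a downstream decoder can begin decoding it without reference to the preceding slice data.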

As is also well known, the motion vectors provided with respect to
each macroblock, and the DC coefficients for each block, are
differentially encoded. In other words, only the difference
between respective motion vectors for the current macroblock and
the preceding macroblock is encoded and transmitted, and also,
only the difference between the respective DC coefficient for the
present block and that of the preceding block is coded and
transmitted.
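A minimal sketch of undoing such differential encoding is given below. It assumes that the predictor starts from a known value at the beginning of each slice, as is the case in MPEG, which is what would allow each slice to be decoded by a separate decoder; the function name and the reset value of 0 are illustrative assumptions.

```python
def undo_differential(diffs, predictor=0):
    """Recover absolute values from a sequence of transmitted differences.

    `predictor` is the assumed starting value at the beginning of a slice.
    """
    values = []
    for d in diffs:
        predictor += d          # current value = previous value + difference
        values.append(predictor)
    return values


# Hypothetical horizontal motion vector components for four macroblocks.
mv_x = undo_differential([3, -1, 0, 2])
```

The same routine would be applied separately to each motion vector component and to the DC coefficients.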

As indicated in Figure 2, the first, fifth, ninth, etc. slices of
each image frame are stored in the first code buffer 26, and these
slices are provided for variable-length decoding by a
variable-length decoder circuit 30. Similarly, the second, sixth,
tenth, etc. slices of the image frame are stored in the second
code buffer 27 for variable-length decoding by variable-length
decoder circuit 31; the third, seventh, eleventh, etc. slices are
stored in the third code buffer 28 for variable-length decoding by
the variable-length decoder circuit 32; and the fourth, eighth,
twelfth, etc. slices are stored in the fourth code buffer 29 for
variable-length decoding by the variable-length decoder circuit
33.
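The cyclic assignment of slices to code buffers described above amounts to a round-robin distribution, which may be sketched as follows. The function name and the use of Python lists to stand in for the code buffers are assumptions made for illustration only.

```python
def distribute_slices(slices, n_buffers=4):
    """Assign slice i (counting from 0) to code buffer i % n_buffers."""
    buffers = [[] for _ in range(n_buffers)]
    for index, s in enumerate(slices):
        buffers[index % n_buffers].append(s)
    return buffers


# Slices numbered 1 to 12, as in the description of Figure 2.
code_buffers = distribute_slices(list(range(1, 13)))
```

With twelve slices, the first buffer receives slices 1, 5 and 9, the second receives slices 2, 6 and 10, and so on.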

According to the example shown in Figure 2, the number of
macroblocks in each slice is fixed, so that it will not be
necessary for any of the variable-length decoders to wait. As a
result, decoding carried on by the variable-length decoders is
synchronized and is carried out efficiently.

It will be understood that, although the number of macroblocks per
slice is fixed, the number of bits per slice in the input signal
will vary because of variable-length encoding. Nevertheless, the
number of macroblocks per slice output by each variable-length
decoding circuit is the same according to this example.

In the example shown in Figure 2, each slice is shown as being one
macroblock high and extending horizontally entirely across the
image frame, so that each slice consists of one row of
macroblocks. However, it is also within the contemplation of this
invention to provide for slices having a fixed length in terms of
macroblocks that is longer or shorter than one row of macroblocks.
It is further contemplated that the number of macroblocks per
slice may be variable within each frame and/or from frame to frame
and that the positions of slices within a frame may vary. In case
variable-length slices are provided within a frame, it will be
appreciated that the number of macroblocks distributed to each of
the variable-length decoders may be unbalanced, in which case some
of the variable-length decoders may be required to output filler
macroblocks (all zeros for example) until other decoders have
"caught up". Furthermore, it is provided that variable-length
decoding of slices from the next image frame will not proceed
until all of the slices of the current frame have been
variable-length decoded.
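The use of filler macroblocks to keep unbalanced decoders in step can be sketched as follows. The sketch assumes that each decoder's output for one group of slices is available as a list of macroblocks and that a filler macroblock is represented by the value 0; both the representation and the function name are illustrative assumptions.

```python
def pad_with_filler(decoder_outputs, filler=0):
    """Pad each decoder's output so that all decoders emit equally many macroblocks."""
    longest = max(len(out) for out in decoder_outputs)
    return [out + [filler] * (longest - len(out)) for out in decoder_outputs]


# Hypothetical outputs of four decoders handling slices of unequal length.
padded = pad_with_filler([["a", "b", "c"], ["d"], ["e", "f"], []])
```

After padding, every decoder contributes the same number of macroblocks, so the downstream block-wise parallel stages can remain synchronized.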

It will be recognized that any loss of decoding ef- 
ficiency that results from the occasional need to in- 
terrupt the processing by some of the variable length 
decoders is compensated for by the fact that the cod- 
ing can be performed with slices that have a variable 
length in terms of macroblocks. 

Details of the variable-length decoding process- 
ing will now be described. 

Data which has been decoded by the respective 
variable length decoders are transferred to buffer 
memories 35-38 by way of switcher 34. Figure 3 illus- 
trates the manner in which data is distributed to, and 
output from, the buffer memories 35-38. It will be not- 
ed that, upstream from the buffers 35-38, processing 
had been performed in a slice-wise parallel manner, 
but downstream from the buffers 35-38 processing is 
performed in a block-wise parallel manner. In partic- 
ular, the four blocks of luminance data making up a 
macroblock are output in parallel from respective 
ones of the buffer memories 35-38. (It will be under- 
stood that a macroblock also includes chrominance 
blocks. For example, in the 4:2:2 format, each mac- 
roblock includes four blocks of chrominance data in 
addition to the four blocks of luminance data. The dis- 
cussion from this point forward will deal only with the 
luminance data blocks, it being understood that the 
corresponding four chrominance data blocks can be 
processed in a similar manner.) 

Referring again to Figure 3, it will be seen that the variable
length decoders 30-33 respectively output simultaneously the
respective first block of the first through fourth slices. The
respective first blocks are distributed among the buffer memories
35-38 so that the first block of the first slice (i.e., the first
block of the first macroblock of the first slice) is stored in the
first buffer memory 35, the second block of the first slice is
stored in the second buffer memory 36, the third block of the
first slice is distributed to the third buffer memory 37, and the
fourth block of the first slice is distributed to the fourth
buffer memory 38. As a result, all four blocks of a single
macroblock can be read out in parallel from the respective buffer
memories 35-38, so that block-wise parallel processing can be
accomplished downstream. Such processing includes conventional
inverse transform processing in accordance with zig-zag scanning.
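The change from slice-wise parallel data to block-wise parallel data can be illustrated with a simplified model. It assumes that block k of every macroblock is routed to buffer memory k, and it does not attempt to reproduce the write timing of Figure 3; the function name and the string labels (in the "slice-block" style, so that "1-1" denotes the first block of slice 1) are illustrative.

```python
def to_block_parallel(slice_streams, blocks_per_mb=4):
    """Route block k of each macroblock to buffer k.

    slice_streams[s] is the ordered list of blocks emitted by decoder s.
    Reading the same position from every buffer then yields the four
    blocks of a single macroblock.
    """
    buffers = [[] for _ in range(blocks_per_mb)]
    n_mb = len(slice_streams[0]) // blocks_per_mb
    for mb in range(n_mb):
        for stream in slice_streams:
            for k in range(blocks_per_mb):
                buffers[k].append(stream[mb * blocks_per_mb + k])
    return buffers


# One macroblock (four blocks) from each of four slices.
streams = [[f"{s}-{b}" for b in range(1, 5)] for s in range(1, 5)]
buffers = to_block_parallel(streams)
first_macroblock = [buf[0] for buf in buffers]
```

In this model each buffer ends up holding four blocks, one from each slice, which corresponds to the capacity of one bank.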

In the example just discussed, each buffer mem- 
ory preferably has two banks which each have the ca- 
pacity of storing four data blocks. 

The block-wise parallel data provided from the 
buffer memories 35-38 is subjected to inverse quan- 
tization and inverse discrete cosine transform proc- 
essing in parallel at processing blocks 39-42. There- 
after, motion compensation processing for the four 
blocks of the macroblock is also carried out in parallel. 
Reference picture data for each macroblock is ex- 
tracted from previously reproduced (i.e., previously 
reconstructed) image data stored in a frame memory 
43. The reference picture data is formed on the basis 
of the motion vector which corresponds to the macro- 
block being processed and is used to form decoded 
data in combination with difference data output from 
the processing blocks 39-42. In this example, since 
motion compensation processing is carried out in par- 
allel for each macroblock (four blocks) of luminance 
data, the motion vectors provided to motion compen- 
sation processing blocks 53-56 from the variable 
length decoders 30-33 always correspond to each 
other at any given time. For this reason, an MC (mo- 
tion compensation) switcher 52 is used to switch a 
data bus, so that it is possible to provide motion com- 
pensation processing of the reference data transfer- 
red to MC buffer memories 48-51 in such a manner 
that memory accessing by the motion compensation 
processing blocks 53-56 does not overlap. As a re- 
sult, the motion compensation search range, and ac- 
cordingly the permissible range of the motion vector, 
is not limited. Details of motion compensation proc- 
essing will be provided below. 
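The combination of reference data and difference data for the four luminance blocks of one macroblock may be sketched as follows. The sketch assumes integer-pel motion vectors, 8 x 8 blocks arranged top-left, top-right, bottom-left, bottom-right, and a reference frame held as a list of rows with no boundary handling; all names are hypothetical, and the memory organization of the apparatus is not modelled.

```python
def fetch_reference(frame, x, y, mv, size=8):
    """Read the size x size reference block displaced by motion vector mv."""
    dx, dy = mv
    return [row[x + dx:x + dx + size] for row in frame[y + dy:y + dy + size]]


def add_blocks(diff, ref):
    """Add a difference block to its reference block, element by element."""
    return [[d + r for d, r in zip(drow, rrow)] for drow, rrow in zip(diff, ref)]


def reconstruct_macroblock(frame, mb_x, mb_y, mv, diffs):
    """Reconstruct four blocks; each iteration is independent of the others."""
    offsets = [(0, 0), (8, 0), (0, 8), (8, 8)]
    out = []
    for (ox, oy), diff in zip(offsets, diffs):
        ref = fetch_reference(frame, mb_x + ox, mb_y + oy, mv)
        out.append(add_blocks(diff, ref))
    return out


# Hypothetical 32 x 32 reference frame whose pixel value encodes its position.
frame = [[100 * y + x for x in range(32)] for y in range(32)]
diffs = [[[1] * 8 for _ in range(8)] for _ in range(4)]
blocks = reconstruct_macroblock(frame, 8, 8, (2, 3), diffs)
```

Since all four blocks share one motion vector and no iteration depends on another, the four additions can be carried out by separate adders at the same time.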

Reproduced decoded image data formed in par- 
allel at adders 57-60 is stored via four parallel proc- 
essing paths in the frame memory 43 by way of stor- 
age buffers 61-64. Moreover, sequences of images 
for which the reproduced (reconstructed) data is stor- 
ed in memory 43 are output to a digital-to-analog con- 
verter 99 through display buffer memories 94-97, and 
a display switcher 98 which is switched according to 
appropriate display timing. The D/A converted signal 
is then displayed on a display device 100. 

There will now be described, with reference to Figure 4, details
of a buffering arrangement provided upstream from the
variable-length decoders of the apparatus of Figure 1.

As shown in Figure 4, an input signal bit stream is received at an
input terminal 65 and provided therefrom to a demultiplexer 66
which divides the bit stream at the beginning of each slice and
distributes the slices among code buffer memories 67-70. The
slices of data are output respectively from the code buffer
memories 67-70 to variable-length decoders 71-74, and
variable-length decoded data is respectively output from each of
the variable-length decoders 71-74 via output terminals 75-78.

The buffering and decoding operations carried out by the circuitry
shown in Figure 4 will now be described with reference to the
timing diagram shown in Figure 5.

In particular, the input bit stream received at the terminal 65 is
divided at the beginning of each slice by the demultiplexer 66.
Because synchronizing code signals indicative of the beginning of
each slice are included at intervals corresponding to a plural
number of macroblocks (such intervals each being referred to as a
slice), the synchronizing code signals are detected at the
demultiplexer 66 for the purpose of performing the division of the
bit stream into slices.

As shown in Figure 5, the resulting slices are written in a
cyclical fashion into the code buffer memories 67-70. In
particular, slice 1, slice 5, slice 9, etc. are written into the
code buffer memory 67; slice 2, slice 6, slice 10, etc. are
written into the code buffer memory 68; slice 3, slice 7, slice
11, etc. are written into the code buffer memory 69; and slice 4,
slice 8, slice 12, etc. are written into the code buffer memory
70.

At a point when slice 4 has been written into the code buffer
memory 70, the slices 1-4 are respectively read out in parallel
from the code buffer memories 67-70 to the four variable-length
decoders 71-74 and variable-length decoding begins.

The variable-length decoders 71-74 each complete decoding
processing of a macroblock from a respective slice within the same
time. Decoded data produced by variable-length decoder 71 is
output via terminal 75; decoded data produced by variable-length
decoder 72 is output via terminal 76; decoded data produced by
variable-length decoder 73 is output via terminal 77; and decoded
data produced by variable-length decoder 74 is output via terminal
78. All of the decoded data is supplied to the switcher 34 (Figure
1). In addition, decoded motion vector data is provided from the
variable-length decoders to the MC switcher 52 and motion
compensation processing blocks 53-56.

It should be understood that, in Figure 5, the symbol "1-1" shown
in the output of IVLC1 (variable-length decoder 71) is indicative
of the first block of slice 1. Similarly, for example, "4-1" shown
in the output of IVLC4 (variable-length decoder 74) is indicative
of the first block of slice 4.

An alternative code buffering arrangement pro- 
vided upstream from the variable-length decoders is 
shown in Figure 6. 

In Figure 6, the input bit stream is again received at an input
terminal 65 and provided therefrom to a demultiplexer 79, at which
the bit stream is divided at the beginning of each slice.
Immediately downstream from the demultiplexer 79 is a code buffer
memory 80 which has respective regions in each of which a slice of
data can be stored. Additional buffer memories 90-93 are provided
downstream from the buffer memory 80. In a similar manner to the
arrangement of Figure 4, the buffered data output from each of the
buffer memories 90-93 is provided to a respective one of the
variable-length decoders 71-74, and the decoded data output from
the variable-length decoders 71-74 is provided at respective
output terminals 75-78.

Operation of the code buffering arrangement shown in Figure 6 will
now be described with reference to the timing diagram of Figure 7.

As before, the input bit stream provided from the terminal 65 is
divided at the beginning of each slice by the demultiplexer 79 on
the basis of synchronizing code signals provided at intervals
corresponding to a number of macroblocks.

As shown in Fig. 7, respective slices are written 
in a cyclical fashion into the regions 1-4 of the buffer 
memory 80. In particular, slice 1, slice 5, slice 9, etc.
are written into region 1; slice 2, slice 6, slice 10, etc. 
are written into region 2; slice 3, slice 7, slice 11, etc. 
are written into region 3; and slice 4, slice 8, slice 12, 
etc. are written into the region 4. 
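
The cyclical assignment of slices to the four regions amounts to a modulo-4 rule. A minimal sketch, in which the names `region_for_slice` and `distribute` are illustrative and not taken from the text:

```python
NUM_REGIONS = 4  # regions 1-4 of the code buffer memory 80


def region_for_slice(slice_number: int) -> int:
    """Return the region (1-4) that receives a 1-based slice number."""
    return (slice_number - 1) % NUM_REGIONS + 1


def distribute(slice_numbers):
    """Group slice numbers by the region into which they are written."""
    regions = {r: [] for r in range(1, NUM_REGIONS + 1)}
    for n in slice_numbers:
        regions[region_for_slice(n)].append(n)
    return regions


regions = distribute(range(1, 13))
```

For slices 1 to 12 this reproduces the grouping given above: region 1 holds slices 1, 5 and 9, and so on.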

At a point when slice 4 has been written into region 4,
the data stored in the four regions are sequentially
read out from the code buffer memory 80. As a result,
slices 1, 5, 9, etc. are read out from region 1 and
written into buffer memory 90; slices 2, 6, 10, etc.
are read out from region 2 and written into buffer
memory 91; slices 3, 7, 11, etc. are read out from
region 3 and written into buffer memory 92; and slices
4, 8, 12, etc. are read out from region 4 and written
into buffer memory 93.

At a time when the contents of region 4 have
been written into the buffer memory 93, the data re- 
spectively stored in the buffer memories 90-93 is read 
out in parallel to the variable-length decoders 71-74, 
and decoding processing starts at that time. 

The variable-length decoders 71-74 each complete
the decoding processing of a respective macroblock
within the same time. Decoded data produced
by variable-length decoder 71 is output via terminal
75; decoded data produced by variable-length decoder
72 is output via terminal 76; decoded data produced
by variable-length decoder 73 is output via terminal
77; and decoded data produced by variable-length
decoder 74 is output via terminal 78. This decoded
data is supplied to the switcher 34, and in addition,
decoded motion vector data is supplied from
the variable-length decoders to MC switcher 52 and
to motion compensation processing blocks 53-56.

As was the case with Figure 5, in Figure 7 the
symbol "1-1" is indicative of the first block in slice 1,
which is decoded by variable-length decoder 71,
while "4-1" is indicative of the first block of slice 4,
which is decoded by the variable-length decoder 74.

With respect to the buffering arrangement shown 
in Figure 4, it is possible to use certain distribution 
methods with respect to input data streams which 
have a processing unit which is shorter than a slice 
and are included in a layer (known as an "upper lay- 
er") which has processing units which are longer than 
a slice. With respect to an input data stream which 
has such a format, it is possible to simultaneously 
write the upper layer into the code buffer memories 
67-70 in order to provide parallel data to the variable- 
length decoders 71-74. Alternatively, the bit stream 
for the upper layer can be written into one of the four 
code buffer memories so that the upper layer is de- 
coded by only one of the four variable-length decod- 
ers, with parameters being set at the other variable- 
length decoders. According to another possible meth- 
od, an additional processor is provided to decode the 
upper layer bit stream so as to set parameters at the 
four variable-length decoders. 

On the other hand, using the arrangement shown 
in Figure 6, the upper layer bit stream can be written 
into one of the four regions of the buffer memory 80 
and the contents of that region can be simultaneously 
written into the buffer memories 90-93 for parallel 
processing by the variable-length decoders 71-74. 
According to an alternative method, the upper layer 
bit stream is written into one of the four regions of the 
buffer memory 80 so that the data is written into one 
of the four buffer memories 90-93 and is then decod- 
ed by one of the four variable-length decoders in or- 
der to set parameters at the other variable-length de- 
coders. 

According to another alternative method, a sep- 
arate processor is provided to decode the upper layer 
bit stream in order to set parameters at the four vari- 
able-length decoders. As a further method, the de- 
multiplexer 79 repeatedly writes the upper layer bit 
stream into the four regions of the buffer memory 80 
so that the data is simultaneously written from each 
region into the buffer memories 90-93 for parallel 
processing in the variable-length decoders 71-74. 

In these ways, distribution of the data stream, 
and parallel processing thereof, can be carried out on 
the basis of parameters included in the data stream. 

Details of decoding processing with respect to 
motion-compensated predictive-coded data will now 
be described. 

Figure 8(A) illustrates a manner in which refer- 
ence image data is distributed among and stored in 
DRAMs 44-47 making up the frame memory 43. Each 
image frame is, as indicated above, divided into mac- 
roblocks, and each macroblock is formed of four 
blocks. Each of the four blocks is, in this particular ex- 
ample, an 8 x 8 array of pixel elements, and each of 
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the blocks constitutes one of four quadrants of its re- 
spective macroblock. The data with respect to each 
macroblock is divided among the four DRAMs 44-47. 
In particular, all of the first blocks (upper left blocks) 
of all of the macroblocks are stored in DRAM 44, all
of the second blocks (upper right blocks) of all of the 
macroblocks are stored in DRAM 45, all of the third 
blocks (lower left blocks) of all of the macroblocks are 
stored in DRAM 46, and all of the fourth blocks (lower 
right blocks) of all of the macroblocks are stored in
DRAM 47. Accordingly, it will be seen that the refer- 
ence data is distributed among DRAMs 44-47 in a 
checkered pattern. 
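
The checkered distribution can be expressed as a mapping from pixel coordinates to a DRAM. The following is a sketch only; it assumes pixel coordinates measured from the upper left corner of the frame, with macroblocks aligned on a 16-pixel grid, and the function name `dram_for_pixel` is illustrative.

```python
BLOCK = 8                  # each block is an 8 x 8 array of pixels
DRAMS = (44, 45, 46, 47)   # upper left, upper right, lower left, lower right


def dram_for_pixel(x: int, y: int) -> int:
    """Return the reference numeral of the DRAM holding pixel (x, y).

    Assumes (0, 0) is the upper left pixel of the frame.
    """
    right = (x // BLOCK) % 2   # 1 if the pixel lies in a right-hand block
    lower = (y // BLOCK) % 2   # 1 if the pixel lies in a lower block
    return DRAMS[2 * lower + right]
```

For example, `dram_for_pixel(0, 0)` gives 44 and `dram_for_pixel(8, 8)` gives 47, and the pattern repeats every 16 pixels in both directions.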

Continuing to refer to Figure 8(A), the square labelled
81 represents the geometric area of the image
frame which corresponds to the macroblock which is
currently being decoded (reconstructed), and reference
numeral 82 represents the motion vector associated
with that macroblock, according to the example
shown in Figure 8(A). In addition, the reference numeral
83 represents the reference data stored in the
DRAMs 44-47 and indicated by the motion vector 82
as corresponding to the current macroblock 81. The
data represented by the shaded square 83 is read out
from the DRAMs 44-47 under control of motion compensation
processing blocks 53-56 on the basis of the
motion vector 82. In particular, the data corresponding
to the "DRAM1" portion of the square 83 (i.e., a
central portion of the square 83) is read out from
DRAM 44 to motion compensation buffer 48 under
the control of motion compensation processing block
53. Similarly, the portions of the shaded square 83
which overlap with squares labelled "DRAM2" (i.e.,
central portions of the left and right sides of the
square 83) are read out from DRAM 45 to motion
compensation buffer 49 under control of motion compensation
processing block 54. Also, the portions of
the shaded square 83 which overlap the squares labelled
"DRAM3" (i.e., the central portions of the upper
and lower edges of the square 83) are read out from
DRAM 46 to motion compensation buffer 50 under
control of motion compensation processing block 55.
Finally, the portions of the shaded square 83 which
overlap with squares labelled "DRAM4" (i.e., corner
regions of the square 83) are read out from DRAM
47 to motion compensation buffer 51 under control of
motion compensation processing block 56.

Figure 8(B) is a schematic illustration of the reference
data read out from the respective DRAMs 44-47
and stored in respective motion compensation buffers
48-51. This data stored in the four motion compensation
buffers 48-51 represents the reference
data for the macroblock which is currently to be reconstructed.
However, the data as stored in the individual
motion compensation buffers does not correspond to
the data required for each of the adders 57-60. Therefore,
the MC switcher 52 is provided between the motion
compensation buffers 48-51 and the adders 57-60
so that the correct reference data is distributed
from the motion compensation buffers to the adders.
The reference data which is supplied to each of the
adders 57-60 is schematically illustrated in Figure
8(C).

Figure 9 illustrates the timing, according to the 
example shown in Figure 8(A), at which data read out
from the motion compensation buffers 48-51 is rout- 
ed among the adders 57-60. 

The processing of the four blocks making up the
macroblock proceeds, as indicated before, in parallel,
with the respective first lines of each of the blocks
being processed simultaneously, then the second lines,
and so forth. With respect to the first lines of the
blocks, initially, at a starting time t1 (Figure 9), data
from motion compensation buffer 51 is routed to adder
57, data from motion compensation buffer 50 is
routed to adder 58, data from motion compensation
buffer 49 is routed to adder 59, and data from motion
compensation buffer 48 is routed to adder 60. At a
changeover point in the processing of the first lines,
indicated by time t2 in Figure 9, the routing is changed
so that data from motion compensation buffer 50 is
routed to adder 57, data from motion compensation
buffer 51 is routed to adder 58, data from motion
compensation buffer 48 is routed to adder 59, and data
from motion compensation buffer 49 is routed to adder
60. This routing state continues until the end of
the first line (indicated by time t3) and then the procedure
that was followed for the first lines is carried
out again with respect to the second lines. The same
procedure is then continued through the nth lines, but
upon completion of the nth lines of the block, as indicated
at time t4, a different routing pattern is established
for the beginning of the (n + 1)th lines. According
to this pattern, data from motion compensation
buffer 49 is provided to adder 57, data from motion
compensation buffer 48 is provided to adder 58, data
from motion compensation buffer 51 is provided to
adder 59, and data from motion compensation buffer
50 is provided to adder 60. This routing arrangement
continues until a changeover point in the (n + 1)th
lines, indicated by time t5, at which the routing arrangement
is changed so that data from motion compensation
buffer 48 is routed to adder 57, data from
motion compensation buffer 49 is routed to adder 58,
data from motion compensation buffer 50 is routed to
adder 59, and data from motion compensation buffer
51 is routed to adder 60. On the completion of the
process for the (n + 1)th line (indicated by time t6), the
procedure carried out for the (n + 1)th lines is repeated
with respect to each of the remaining lines of the
blocks until the last (eighth) lines have been processed,
at which point (indicated by time t7) processing for
the macroblock is complete. Processing for the next
macroblock then begins, on the basis of the motion
vector associated with the next macroblock.
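
The four routing states described above follow from where each reference pixel falls in the checkered pattern. The sketch below is one way to derive them, not a description of the actual switcher circuit. It assumes pixel coordinates measured from the upper left corner of the frame; the function name `routing` and the example position (12, 13) of the upper left corner of the reference square 83 are hypothetical, chosen so that the first three lines reproduce the state at t1.

```python
BLOCK = 8
BUFFERS = (48, 49, 50, 51)   # fed by DRAMs 44, 45, 46, 47 respectively
ADDERS = (57, 58, 59, 60)    # upper left, upper right, lower left, lower right


def routing(x0: int, y0: int, line: int, col: int) -> dict:
    """Map each adder to the buffer that feeds it for one pixel position.

    (x0, y0) is the upper left pixel of the reference data (assumed frame
    coordinates); line and col (0-7) index the pixel within each block.
    """
    xbit = ((x0 + col) // BLOCK) % 2
    ybit = ((y0 + line) // BLOCK) % 2
    mask = 2 * ybit + xbit
    # Moving 8 pixels right or down flips the corresponding bit, so the four
    # adders always draw on four different buffers.
    return {ADDERS[k]: BUFFERS[k ^ mask] for k in range(4)}


t1 = routing(12, 13, line=0, col=0)   # start of the first lines
t2 = routing(12, 13, line=0, col=4)   # after the horizontal changeover
t4 = routing(12, 13, line=3, col=0)   # start of the (n + 1)th lines, n = 3
t5 = routing(12, 13, line=3, col=4)   # after the changeover in those lines
```

With this example position, `t1` routes buffer 51 to adder 57 and buffer 48 to adder 60, and `t5` is the straight-through state in which buffer 48 feeds adder 57.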

It will be appreciated that the reference data
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supplied to the adders 57-60 is added by the adders to
the current difference data supplied thereto from the
processing circuits 39-42 so that macroblocks of reconstructed
image data are produced. It will also be
recognized that the storage of the reference data according
to the above-described checkered pattern in
the frame memory 43, and the above-described
method of reading out, buffering, and switching the
reference data, make it possible to provide motion-compensation
decoding processing without any restriction
on the range of the motion vector, and in such
a manner that memory accesses do not overlap.
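
One way to see why the range of the motion vector need not be restricted is to count how many reference pixels each DRAM must supply for an arbitrary position of the reference data. The following sketch, which assumes the checkered mapping described above and uses illustrative names (`bank_for_pixel`, `pixels_per_bank`), checks every position within one 32 x 32 period of the pattern.

```python
from collections import Counter

BLOCK = 8
MACROBLOCK = 16


def bank_for_pixel(x: int, y: int) -> int:
    """Index 0-3 of the DRAM (44-47) holding pixel (x, y)."""
    return ((y // BLOCK) % 2) * 2 + (x // BLOCK) % 2


def pixels_per_bank(x0: int, y0: int) -> Counter:
    """Count the pixels each DRAM supplies for a 16 x 16 reference area."""
    counts = Counter()
    for dy in range(MACROBLOCK):
        for dx in range(MACROBLOCK):
            counts[bank_for_pixel(x0 + dx, y0 + dy)] += 1
    return counts


# Every position requires exactly 64 pixels (one block's worth) per DRAM.
balanced = all(
    sorted(pixels_per_bank(x0, y0).items()) == [(b, 64) for b in range(4)]
    for x0 in range(32)
    for y0 in range(32)
)
```

Because the load is the same for every offset, the four DRAMs can always be read in parallel for the same length of time, whatever the motion vector.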

In the embodiment illustrated in Figure 1, the MC
switcher 52 is provided between the motion compensation
buffers 48-51 and the adders 57-60. However,
according to an alternative embodiment, shown in
Figure 10, the MC switcher 52 can be provided between
the DRAMs 44-47 and the motion compensation
buffers 48-51, with each of the buffers 48-51 connected
directly to, and providing data exclusively to,
a respective one of the adders 57-60.

A method of operating the embodiment illustrated
in Figure 10 will be described with reference to
Figures 11(A)-(C).

Figure 11(A) is similar to Figure 8(A), and shows
a square 84 which represents the geometric area
corresponding to the macroblock currently being
processed, motion vector 85 associated with the current
macroblock, and a shaded square 86 which represents
the appropriate reference data for the current
macroblock as indicated by the motion vector 85. It
will also be noted that the reference data is distributed
for storage among the DRAMs 44-47 in a block-wise
manner according to the same checkered pattern
shown in Figure 8(A).

Under control of the motion compensation processing
blocks 53-56, and on the basis of the motion
vector for the current macroblock, data is read out
from the DRAMs 44-47 and routed to the motion
compensation buffers 48-51 by the MC switcher 52 so that
all of the reference data to be provided to the adder
57 is stored in the motion compensation buffer 48, all
of the reference data to be provided to the adder 58
is stored in the motion compensation buffer 49, all of
the reference data to be provided to the adder 59 is
stored in the motion compensation buffer 50, and all
of the reference data to be provided to the adder 60 is
stored in the motion compensation buffer 51. Referring
to Figures 11(A) and (B), it will be noted that the
data represented by the upper left quadrant of the
shaded square 86 is stored in the motion compensation
buffer 48, the data represented by the upper right
quadrant of the shaded square 86 is stored in the
motion compensation buffer 49, the data represented by
the lower left quadrant of the shaded square 86 is
stored in the motion compensation buffer 50, and the
data represented by the lower right quadrant of the
shaded square 86 is stored in the motion compensation
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buffer 51. More specifically, during an initial read
out period, data is simultaneously read out from all 
four of the DRAMs 44-47 and routed such that data 
from a portion of the DRAM 47 is stored in motion 
compensation buffer 48, while data from a portion of 
DRAM 46 is stored in motion compensation buffer 49, 
data from a portion of DRAM 45 is stored in motion 
compensation buffer 50, and data from a portion of 
DRAM 44 is stored in motion compensation buffer 51.
During a second read out period there is again simul- 
taneous reading out of data from the four DRAMs, but 
now the routing is such that data from a portion of 
DRAM 46 is stored in motion compensation buffer 48, 
data from a portion of DRAM 47 is stored in motion 
compensation buffer 49, data from a portion of DRAM 
44 is stored in motion compensation buffer 50, and 
data from a portion of DRAM 45 is stored in motion 
compensation buffer 51. Moreover, during a third 
read out period, again there is simultaneous read out 
from all of the DRAMs, but routing is performed so 
that data from a portion of DRAM 45 is stored in mo- 
tion compensation buffer 48, data from a portion of 
DRAM 44 is stored in motion compensation buffer 49, 
data from a portion of DRAM 47 is stored in motion 
compensation buffer 50, and data from a portion of 
DRAM 46 is stored in motion compensation buffer 51.
Then, during a final read out period, data is simulta- 
neously read out from four DRAMs and routed such 
that data from a portion of DRAM 44 is stored in mo- 
tion compensation buffer 48, data from a portion of 
DRAM 45 is stored in motion compensation buffer 49, 
data from a portion of DRAM 46 is stored in motion 
compensation buffer 50, and data from a portion of 
DRAM 47 is stored in motion compensation buffer 51. 

It will be observed that data from every one of the
four DRAMs is thus stored in each of the motion
compensation buffers. Moreover, with reading of the data
from the DRAMs and control of the MC switcher 52 
on the basis of the motion vector for the current mac- 
roblock, memory access can be performed without 
overlap. 
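
The observation that each motion compensation buffer receives data from every one of the four DRAMs can be checked with a short sketch. It assumes the checkered mapping described above with pixel coordinates measured from the upper left corner of the frame; the function name `sources_per_buffer` and the example positions are hypothetical.

```python
BLOCK = 8
DRAMS = (44, 45, 46, 47)
BUFFERS = (48, 49, 50, 51)   # one buffer per quadrant of the reference data


def sources_per_buffer(x0: int, y0: int) -> dict:
    """For each buffer, return the set of DRAMs whose data it stores.

    (x0, y0) is the upper left pixel of the reference data (assumed frame
    coordinates).
    """
    result = {}
    for k, buf in enumerate(BUFFERS):
        qx, qy = k % 2, k // 2       # quadrant column and row
        banks = set()
        for line in range(BLOCK):
            for col in range(BLOCK):
                x = x0 + BLOCK * qx + col
                y = y0 + BLOCK * qy + line
                banks.add(DRAMS[((y // BLOCK) % 2) * 2 + (x // BLOCK) % 2])
        result[buf] = banks
    return result


unaligned = sources_per_buffer(12, 13)   # straddles block boundaries
aligned = sources_per_buffer(16, 32)     # coincides with a stored macroblock
```

For the unaligned position every buffer draws on all four DRAMs, whereas in the aligned case each buffer is filled from a single DRAM.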

Also, because each of the motion compensation 
buffers is associated exclusively with a respective
adder, and the reference data has been stored appro- 
priately therein, as shown in Figure 11(C), there is 
also no difficulty in accessing the motion compensa- 
tion buffers. 

There will now be described, with reference to
Figures 12 and 13, in addition to Figure 10, an alternative
method of operating the embodiment of Figure
10 so that the appropriate reference data is stored in
each of the motion compensation buffers 48-51.

As indicated in Figure 12(A), according to this al- 
ternative method of operation, the reference data is 
distributed line-by-line among the DRAMs 44-47,
rather than block-by-block, as in the technique shown 
in Figure 11(A). For example, referring again to Figure 
12(A), the data for the first line of each macroblock 
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(i.e., the first line of the first and second blocks of the
macroblock), is stored in DRAM 44, the second line of 
data of each macroblock is stored in DRAM 45, the 
third line of data for each macroblock is stored in 
DRAM 46, the fourth line of each macroblock is stored 
in DRAM 47, the fifth line of each macroblock is stor- 
ed in DRAM 44, and so forth, continuing in a cyclical 
fashion, line-by-line. It should be understood that the 
data for the ninth line of each macroblock (i.e., the 
first line of data in the third and fourth blocks of each 
macroblock) is stored in DRAM 44, whereas the data 
for the last line of each macroblock (i.e., the last line 
of the last two blocks of the macroblock) is stored in 
DRAM 47. Accordingly, the reference data is distributed
among the DRAMs 44-47 according to a striped
pattern, rather than the checkered pattern of Figure 
11(A). 

In Figure 12(A), the square labelled 87 repre- 
sents the geometric area which corresponds to the 
macroblock which is currently to be decoded, the mo- 
tion vector 88 is the motion vector associated with the 
current macroblock, and the square 89 represents the 
appropriate reference data for the current macro- 
block, as indicated by the motion vector 88. 

Figures 12(B) and 13 indicate the sources
of data and the timing according to which the appropriate
reference data is stored in the motion compensation
buffers 48-51. As before, data is read out from
the DRAMs 44-47 and routed by MC switcher 52 under
the control of the motion compensation processing
blocks 53-56 and on the basis of the motion vector
for the current macroblock.

In particular, during a first time slot, the reference
data corresponding to the first line of the first block is
read out from DRAM 47 and stored in motion compensation
buffer 48. During the same time slot, reference
data corresponding to the eighth line of the second
block is read out from DRAM 46 and stored in motion
compensation buffer 49, reference data for the seventh
line of the third block is read out from DRAM 45
and stored in motion compensation buffer 50, and reference
data for the sixth line of the fourth block is read
out from DRAM 44 and stored in motion compensation
buffer 51.

In the next (second) time slot, a one line shift in 
routing occurs, so that reference data for the second 
line of the first block is read out from DRAM 44 and 
stored in motion compensation buffer 48, reference 
data for the first line of the second block is read out 
from DRAM 47 and stored in motion compensation 
buffer 49, reference data for the eighth line of the third 
block is read out from DRAM 46 and stored in motion 
compensation buffer 50, and reference data for the 
seventh line of the fourth block is read out from DRAM 
45 and stored in motion compensation buffer 51.

The one-line shifts are continued in each of the
succeeding six time slots so that the data is read out,
routed and stored in the motion compensation buffers
according to the pattern shown in Figures 12(B) and
13. It will be observed that memory access occurs, as
before, without overlapping.
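
The staggered, one-line-per-slot read-out can be sketched as a schedule. The sketch assumes that frame row r is stored in DRAM 44 + (r mod 4), which is consistent with the striped pattern described above because each macroblock starts on a multiple of 16 rows. The function name `schedule` and the starting row `y0 = 3` of the reference data are hypothetical, chosen so that the first two time slots match those just described.

```python
LINES = 8                    # lines per block, and number of time slots
DRAMS = (44, 45, 46, 47)     # frame row r is assumed stored in DRAMS[r % 4]
BUFFERS = (48, 49, 50, 51)   # buffers for the first to fourth blocks


def schedule(y0: int) -> list:
    """Return, per time slot, a list of (buffer, block line 1-8, DRAM).

    y0 is the frame row of the top line of the reference data (assumed).
    """
    slots = []
    for t in range(LINES):
        accesses = []
        for k, buf in enumerate(BUFFERS):
            line = (t - k) % LINES            # block k lags k lines behind
            # The third and fourth blocks lie 8 rows lower; 8 % 4 == 0, so
            # the offset does not change which DRAM is selected.
            row = y0 + line + (LINES if k >= 2 else 0)
            accesses.append((buf, line + 1, DRAMS[row % 4]))
        slots.append(accesses)
    return slots


first_slot, second_slot = schedule(3)[:2]
```

Because the four blocks are always one line apart, the four accesses in any time slot fall on four different DRAMs, which is why no overlap occurs.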

As a result, the reference data which is to be supplied
to adder 57 is stored in motion compensation
buffer 48, reference data which is to be supplied to adder
58 is stored in motion compensation buffer 49, reference
data which is to be supplied to adder 59 is stored
in motion compensation buffer 50, and reference
data which is to be supplied to adder 60 is stored in
motion compensation buffer 51. Again, there is no
problem with overlapping memory accesses with respect
to the motion compensation buffers.

Although the above embodiments of the present
invention have been described with respect to a decoding
apparatus, it should be understood that the
same could also be applied to a local decoder provided
in a data encoding apparatus.

The moving picture video data decoding apparatus
provided in accordance with this invention distributes
an input data stream for parallel decoding processing
on the basis of synchronizing code signals
present in the data stream, and the decoding processing
is continuously carried out within a time period
between synchronizing codes. Accordingly, there is
no limitation placed on the coding method with respect
to time periods between synchronizing codes.
Thus, parallel decoding processing can be carried out
with respect to data that has been encoded by a conventional
method, which difference-codes motion
vectors, DC coefficients and the like on the basis of
differences between a current block and a previous
block.
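
The compatibility with difference-coding can be illustrated by a small sketch. It assumes, as is the case for MPEG slices but is not stated explicitly here, that the predictor is reset at each synchronizing code, so that no difference value depends on data from another slice. The function name `decode_slice` and the difference values are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_slice(differences, predictor=0):
    """Rebuild values (e.g. DC coefficients) from block-to-block differences.

    The predictor starts from a fixed value at the beginning of the slice
    (assumed reset at each synchronizing code).
    """
    values = []
    for d in differences:
        predictor += d
        values.append(predictor)
    return values


# Hypothetical difference data for four slices.
slices = [[5, 1, -2], [7, 0, 3], [2, 2, 2], [9, -4, 1]]

# One worker per slice stands in for the four variable-length decoders.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(decode_slice, slices))

serial = [decode_slice(s) for s in slices]
```

Under that assumption `parallel` equals `serial`, since the chain of differences never crosses a slice boundary.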

In addition, in the decoding apparatus provided in
accordance with this invention, the blocks making up
a macroblock are simultaneously processed in parallel
so that video data that has been encoded by a conventional
encoding method, without modification,
can be reproduced at high speed.

Furthermore, decoding of motion-compensation
coded video data can be carried out with parallel read-out
of reference data from a plurality of memory
banks based on the same motion vector, so that a
plurality of reference data memory banks and motion
compensation circuits can be operated in parallel to
carry out high speed processing on the basis of a conventional
encoding method that is not modified by
limiting the range of motion vectors, or by placing
other limitations on motion prediction.

As used in the specification and the following
claims, the term "image frame" should be understood
to mean a signal representing a picture upon which
motion-compensated predictive coding is performed.
As will be understood by those skilled in the art, such
a picture may be formed, for example, of a progressive-scanned
video frame, one field of an interlace-scanned
video frame, or two fields which together
make up an interlace-scanned video frame.

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In at least preferred embodiments there is provid- 
ed a method and apparatus for decoding a video sig- 
nal in which a plurality of memory units and motion 
compensation devices are operated in parallel to 
process video data encoded according to a known
standard, and without limiting the range of motion 
vectors used for predictive coding or requiring similar 
restrictions on motion predictive compression-cod- 
ing. 

Having described specific preferred embodiments
of the present invention with reference to the
accompanying drawings, it is to be understood that 
the invention is not limited to those precise embodi- 
ments, and that various changes and modifications 
may be effected by one skilled in the art without
departing from the scope of the invention as defined in
the appended claims. 



Claims 
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1. An apparatus for decoding a coded video signal 
that represents an image frame, said coded video
signal having been divided into a plurality of
slices (1, 2, 3, 4), each of said slices being a
sequence of macroblocks (MB), each of said macroblocks
being a two-dimensional array of picture
elements of said image frame, said coded video
signal being a bit stream that represents a sequence
of said slices which together represent
said image frame, said bit stream including a
plurality of synchronizing code signals, each of
which is associated with a respective one of said
slices for indicating a beginning of the respective
slice, the apparatus comprising:

a plurality of decoding means (30, 31, 32,
33), each for decoding a respective portion of
said coded video signal that represents said image
frame; and

distributing means (25) responsive to said
synchronizing code signals for distributing said
slices among said plurality of decoding means.

2. An apparatus according to claim 1, wherein said
plurality of decoding means is fewer in number
than said plurality of slices into which said coded
video signal which represents said image frame
was divided, and said distributing means distributes
said slices in cyclical fashion among said decoding
means.

3. An apparatus according to any one of claims 1
and 2, wherein each of said slices represents a
portion of said image frame which is one macroblock
high and extends horizontally entirely
across said image frame.

4. An apparatus according to claim 3, wherein each
of said macroblocks is a 16 x 16 array of said picture
elements.

5. An apparatus for decoding input signal blocks 
that were formed by transform encoding and then 
variable-length encoding blocks of video data, 
the apparatus comprising: 

decoding means (30, 31, 32, 33) for variable-length
decoding a series of said input signal
blocks; 

parallel data means (34) for forming plural 
parallel data streams, each of which includes re- 
spective ones of said series of input signal blocks 
which were variable-length decoded by said de- 
coding means; and 

a plurality of inverse transform means (39,
40, 41, 42) each for receiving a respective one of
said parallel data streams and for performing inverse
transform processing on the variable-length
decoded signal blocks in the respective
data stream.

6. An apparatus according to claim 5, wherein said 
decoding means is one of a plurality of decoding 
means for variable-length decoding respective 
series of input signal blocks; and further compris- 
ing distributing means (25) for forming said re- 
spective series of input signal blocks to be decod- 
ed by said plural decoding means from a bit 
stream representing an image frame and in re- 
sponse to synchronizing signals provided at pre- 
determined intervals in said bit stream represent- 
ing said image frame. 

7. An apparatus for decoding an input digital video 
signal which includes groups of blocks (83) of
prediction-coded difference data, each of said
groups consisting of a predetermined plurality of 
said blocks (MB) and having a respective motion 
vector (82) associated therewith, each of said 
blocks of prediction-coded difference data hav- 
ing been formed on the basis of the respective 
motion vector associated with the respective 
group which includes said block, the apparatus 
comprising: 

output means (39, 40, 41, 42) for supplying
in parallel blocks of prediction-coded difference 
data contained in one of said groups of blocks; 

reference data means (43, 53, 54, 55, 56, 
48, 49, 50, 51) for supplying in parallel plural 
blocks of reference data, each of said blocks of 
reference data being formed on the basis of the 
motion vector associated with said one of said 
groups of blocks and corresponding to one of said 
blocks of prediction-coded difference data sup- 
plied by said output means; and 

a plurality of adding means (57, 58, 59, 60) 
each connected to said output means and said 
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reference data means for adding a respective one 
of said blocks of prediction-coded difference data 
and the corresponding block of reference data. 

8. An apparatus according to claim 7, wherein each 
of said groups of blocks is a macroblock which in- 
cludes four blocks of prediction-coded data and 
said plurality of adding means consists of four ad- 
ders (57, 58, 59, 60) operating in parallel. 
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9. An apparatus according to any one of claims 7 
and 8, wherein said reference data means com- 
prises: 

a plurality of reference data memories (44,
45, 46, 47) from which reference data is read out
in parallel on the basis of said motion vector associated
with said one of said groups of blocks;

a plurality of buffer memories (48, 49, 50,
51), each for temporarily storing the reference
data read out from a respective one of said plurality
of reference data memories and for reading
out the temporarily stored data on the basis of
said motion vector associated with said one of
said group of blocks; and

distributing means (52) connected between
said buffer memories and said adding
means for distributing among said plurality of
adding means, on the basis of said motion vector
associated with said one of said groups of blocks,
the reference data read out from said plurality of
buffer memories.

10. An apparatus according to any one of claims 7
and 8, wherein said reference data means comprises:

a plurality of reference data memories (44,
45, 46, 47) from which reference data is read out
in parallel on the basis of said motion vector associated
with said one of said groups of blocks;

a plurality of buffer memories (48, 49, 50,
51), each connected to a respective one of said
adding means, for temporarily storing reference
data read out from said plurality of reference data
memories and for supplying the temporarily stored
reference data to its respective adding means;
and

distributing means (52) connected between
said reference data memories and said
buffer memories for distributing among the plurality
of buffer memories, on the basis of said motion
vector associated with said one of said
groups of blocks, the reference data read out
from the plurality of reference data memories.



11. An apparatus according to claim 10, wherein 
each of said buffer memories temporarily stores 
reference data read out from every one of said 
reference data memories. 






12. An apparatus according to any one of claims 7 to 
11, wherein said input digital video signal includes 
input signal blocks that were formed by transform 
encoding and then variable-length encoding 
blocks of prediction-coded difference data, and 
said output means comprises: 

decoding means (30, 31, 32, 33) for vari- 
able-length decoding a series of said input signal 
blocks; 

parallel data means (34) for forming plural 
parallel data streams, each of which includes re- 
spective ones of said series of input signal blocks 
which were variable-length decoded by said de- 
coding means; and 

a plurality of inverse transform means (39, 
40, 41, 42) each for receiving a respective one of
said parallel data streams and for performing in- 
verse transform processing on the variable- 
length decoded signal blocks in the respective 
data stream to form blocks of prediction-coded 
difference data that are supplied to said adding 
means. 

13. An apparatus according to claim 12, wherein said
decoding means is one of a plurality of decoding 
means (30, 31, 32, 33) for variable-length decod- 
ing respective series of input signal blocks; and 
further comprising distributing means (25) for 
forming said respective series of input signal 
blocks to be decoded by said plural decoding 
means from a bit stream representing an image 
frame and in response to synchronizing signals 
provided at predetermined intervals in said bit 
stream representing said image frame. 

14. A method of decoding a coded video signal that 
represents an image frame, said coded video sig- 
nal having been divided into a plurality of slices 
(1, 2, 3, 4), each of said slices being a sequence 
of macroblocks (MB), each of said macroblocks 
being a two-dimensional array of picture ele- 
ments of said image frame, said coded video sig- 
nal being a bit stream that represents a sequence 
of said slices which together represent said im- 
age frame, said bit stream including a plurality of 
synchronizing code signals, each of which is as- 
sociated with a respective one of said slices for 
indicating a beginning of the respective slice, the 
method comprising the steps of: 

providing a plurality of decoding means 
(30, 31, 32, 33), each for decoding a respective 
portion of said coded signal that represents said 
video frame; and 

distributing said slices among said plural- 
ity of decoding means in response to said syn- 
chronizing code signals. 

15. A method according to claim 14, wherein said 
plurality of decoding means is fewer in number
than said plurality of slices into which said coded 
video signal which represents said image frame 
was divided, and said distributing step includes 
distributing said slices in cyclical fashion among
said decoding means. 
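
The cyclical distribution recited in claims 14 and 15 may be illustrated by the following minimal sketch. It assumes, purely for illustration, that the bit stream is a byte string and that every slice begins with a hypothetical synchronizing code `SYNC`; the names `split_slices` and `distribute_slices` and the decoder count are likewise assumptions and form no part of the claimed method.

```python
SYNC = b"\x00\x00\x01"  # hypothetical synchronizing code marking the start of a slice


def split_slices(bitstream: bytes) -> list[bytes]:
    """Cut the stream at each synchronizing code; each slice keeps its own code."""
    starts = []
    pos = bitstream.find(SYNC)
    while pos != -1:
        starts.append(pos)
        pos = bitstream.find(SYNC, pos + len(SYNC))
    ends = starts[1:] + [len(bitstream)]
    return [bitstream[s:e] for s, e in zip(starts, ends)]


def distribute_slices(bitstream: bytes, num_decoders: int) -> list[list[bytes]]:
    """Hand slice k to decoder k mod num_decoders, i.e. in cyclical fashion."""
    queues = [[] for _ in range(num_decoders)]
    for k, slice_data in enumerate(split_slices(bitstream)):
        queues[k % num_decoders].append(slice_data)
    return queues
```

With six slices and four decoders, the first decoder would receive the first and fifth slices and the second decoder the second and sixth, so that no decoder waits for another to finish before its next slice is assigned.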

16. A method according to claim 14, wherein each of 
said slices represents a portion of said image 
frame which is one macroblock high and extends
entirely across said image frame. 

17. A method according to claim 16, wherein each of 
said macroblocks is a 16 x 16 array of said picture
elements.

18. A method of decoding input signal blocks that 
were formed by transform encoding and then
variable-length encoding blocks of video data, the
method comprising the steps of:

variable-length decoding a series of said 
input signal blocks; 

forming plural parallel data streams, each
of which includes respective ones of said
variable-length decoded series of input signal blocks;
and 

performing, in parallel, inverse transform 
processing on the variable-length decoded signal 
blocks in the respective data streams. 

19. A method according to claim 18, further comprising
the steps of:

forming in parallel plural series of input signal
blocks from a bit stream representing an image
frame of input video signals and in response
to synchronizing signals provided at predeter- 
mined intervals in said bit stream representing 
said frame of input signals; and 

variable-length decoding, in parallel, the 
plural series of input signal blocks.

20. A method according to claim 19, further comprising
the step of distributing variable-length decoded
input signal blocks from every one of said plural
series of input signal blocks to each of said plural
parallel data streams.

21. A method of decoding an input digital video signal
which includes groups of blocks of prediction-coded
difference data, each of said groups consisting
of a predetermined plurality of said blocks
and having a respective motion vector associated
therewith, each of said blocks of prediction-coded
difference data having been formed on the basis
of the respective motion vector associated
with the respective group which includes said
block, the method comprising the steps of:

outputting in parallel blocks of prediction-coded
difference data contained in one of said
groups of blocks; 

reading out in parallel from memory, on 
the basis of the motion vector associated with 
said one of said groups of blocks, plural blocks of 
reference data, each of said blocks of reference 
data corresponding to one of said blocks of pre- 
diction-coded difference data; and 

respectively adding, in parallel, the blocks 
of prediction-coded difference data contained in 
said one of said groups of blocks and the corre- 
sponding blocks of reference data. 
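
The reading-out and adding steps of claim 21 may be sketched as follows. The sketch assumes, for illustration only, a reference frame held as a list of rows, a group of four 8 x 8 blocks ordered upper left, upper right, lower left, lower right, and an integer motion vector given as a hypothetical `(dy, dx)` pair. The function names are assumptions. The four blocks are handled in a loop here, whereas the claim recites that they are handled in parallel, and boundary handling and clipping are omitted.

```python
def read_reference_block(ref_frame, top, left, mv, size=8):
    """Read one size x size block of reference data displaced by the motion vector."""
    dy, dx = mv  # assumed convention: (vertical, horizontal) displacement in pixels
    rows = ref_frame[top + dy : top + dy + size]
    return [row[left + dx : left + dx + size] for row in rows]


def reconstruct_group(ref_frame, mb_top, mb_left, mv, diff_blocks, size=8):
    """Add each block of difference data to its corresponding block of reference data."""
    # Offsets of the four blocks inside the group (UL, UR, LL, LR).
    offsets = [(0, 0), (0, size), (size, 0), (size, size)]
    out = []
    for (oy, ox), diff in zip(offsets, diff_blocks):
        ref = read_reference_block(ref_frame, mb_top + oy, mb_left + ox, mv, size)
        out.append([[r + d for r, d in zip(ref_row, diff_row)]
                    for ref_row, diff_row in zip(ref, diff)])
    return out
```

The single motion vector of the group selects all four blocks of reference data, which is why the four additions are independent of one another and can proceed at the same time.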

22. A method according to claim 21, wherein said 
reading out step comprises the sub-steps of: 

reading out the reference data from a plur- 
ality of memories on the basis of the motion vec- 
tor associated with said one of said groups of 
blocks; 

distributing, on the basis of the motion 
vector associated with said one of said groups of 
blocks, the reference data read out from the plur- 
ality of memories; 

temporarily storing the distributed refer- 
ence data; and 

reading out the temporarily stored data. 

23. A method according to any one of claims 21 and 
22, wherein said input digital video signal in- 
cludes input signal blocks that were formed by 
transform-encoding and then variable-length en- 
coding blocks of prediction-coded difference 
data, said outputting step comprising the sub- 
steps of: 

variable-length decoding a series of said
input signal blocks; 

forming plural parallel data streams, each 
of which includes respective ones of said vari- 
able-length decoded series of input signal blocks; 
and 

performing, in parallel, inverse transform 
processing on the variable-length decoded signal 
blocks in the respective data streams. 

24. A method according to claim 23, further compris- 
ing the steps of: 

forming in parallel plural series of input sig- 
nal blocks from a bit stream representing an im- 
age frame of input video signals and in response 
to synchronizing signals provided at predeter- 
mined intervals in said bit stream representing 
said frame of input signals; and 

variable-length decoding, in parallel, the 
plural series of input signal blocks. 

25. A method of decoding a prediction-coded video 
signal that represents an image frame, said pre- 
diction-coded video signal having been divided 
into a plurality of macroblocks, each of said macroblocks
being a two-dimensional array of picture
elements of said image frame, the method comprising
the steps of:

providing a plurality of memories each for
storing reference data which corresponds to a re- 
spective portion of said image frame, said plural- 
ity of memories together storing reference data 
which represents a complete image frame; and 

distributing data representing a reconstructed
image frame for storage in said plurality
of memories such that a portion of each macro- 
block of the reconstructed image frame is stored 
in each of said plurality of memories. 

26. A method according to claim 25, wherein said 
macroblocks are each composed of a predeter- 
mined number of two-dimensional blocks and 
each of said plurality of memories stores corresponding
blocks from all of the macroblocks of an
image frame. 

27. A method according to claim 26, wherein said 
plurality of memories consists of first, second, 
third and fourth memories, said macroblocks are
each composed of four blocks which respectively 
represent upper left, upper right, lower left and 
lower right quadrants of the respective macro- 
block, and said distributing step comprises: 

storing in the first memory the blocks representing
the upper left quadrants of all of the
macroblocks; 

storing in the second memory the blocks 
representing the upper right quadrants of all of 
the macroblocks;

storing in the third memory the blocks rep- 
resenting the lower left quadrants of all of the 
macroblocks; and 

storing in the fourth memory the blocks 
representing the lower right quadrants of all of the
macroblocks. 
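
The quadrant-wise storage of claim 27 may be sketched as below. The sketch assumes each macroblock is a square list of rows (16 x 16 in the usage shown) and models each memory as a Python list; the names `split_quadrants` and `store_quadrants` are hypothetical.

```python
def split_quadrants(mb):
    """Return the upper-left, upper-right, lower-left and lower-right blocks."""
    half = len(mb) // 2
    upper_left = [row[:half] for row in mb[:half]]
    upper_right = [row[half:] for row in mb[:half]]
    lower_left = [row[:half] for row in mb[half:]]
    lower_right = [row[half:] for row in mb[half:]]
    return upper_left, upper_right, lower_left, lower_right


def store_quadrants(macroblocks):
    """Store quadrant q of every macroblock in memory q (first to fourth)."""
    memories = [[], [], [], []]
    for mb in macroblocks:
        for memory, block in zip(memories, split_quadrants(mb)):
            memory.append(block)
    return memories
```

Because every macroblock contributes exactly one block to each memory, the four blocks of any one macroblock can later be accessed from four different memories at the same time.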

28. A method according to any one of claims 25, 26 
and 27, wherein said distributing step comprises 
storing a first line of each of said macroblocks in
a first one of said plurality of memories and storing
a second line of each of said macroblocks in
a second one of said plurality of memories.

29. A method according to claim 28, wherein each of
said macroblocks is composed of a number of
lines that is an integral multiple of a number of
memories that forms said plurality of memories,
and said distributing step comprises distributing
said lines of each macroblock in cyclical fashion
among said memories.
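
The line-wise cyclical distribution of claims 28 and 29 may be sketched as follows, with each memory modelled as a Python list. The function name `distribute_lines` is an assumption, and the check on the line count simply mirrors the integral-multiple condition of claim 29.

```python
def distribute_lines(mb_lines, num_memories):
    """Store line i of a macroblock in memory i mod num_memories."""
    if len(mb_lines) % num_memories != 0:
        raise ValueError("line count must be an integral multiple of the memory count")
    memories = [[] for _ in range(num_memories)]
    for i, line in enumerate(mb_lines):
        memories[i % num_memories].append(line)
    return memories
```

For a macroblock of sixteen lines and four memories, the first memory would hold lines 0, 4, 8 and 12 (counting from zero), the second lines 1, 5, 9 and 13, and so on, so each memory holds the same number of lines from every macroblock.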



30. A method according to claim 29, wherein each of
said macroblocks is composed of sixteen lines,
and said number of memories is four,